Simple features for statistical Word Sense Disambiguation

Authors

  • Abolfazl Keighobadi Lamjiri
  • Osama El Demerdash
  • Leila Kosseim
Abstract

In this paper, we describe our experiments on statistical word sense disambiguation (WSD) using two systems based on different approaches: Naïve Bayes on word tokens and Maximum Entropy on local syntactic and semantic features. In the first approach, we consider a context window and a sub-window within it around the word to disambiguate. Within the outer window, only content words are considered, but within the sub-window, all words are taken into account. Both window sizes are tuned by the system for each word to disambiguate, and accuracies of 75% and 67% were obtained for coarse- and fine-grained evaluations, respectively. In the second system, sense resolution is done using an approximate syntactic structure as well as the semantics of neighboring nouns as features for a Maximum Entropy learner. Accuracies of 70% and 63% were obtained for coarse- and fine-grained evaluations, respectively.
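
The window scheme of the first system can be illustrated with a minimal sketch. This is not the authors' implementation: the stop-word list, the function and class names, the default window sizes, and the add-one smoothing are all assumptions made for illustration, and "content words" is approximated here as "words not in the stop-word list".

```python
import math
from collections import Counter, defaultdict

# Hypothetical stop-word list; the list used in the paper is not given.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "is", "at", "by", "for"}


def window_features(tokens, target_idx, outer=5, inner=2):
    """Collect context words around tokens[target_idx].

    Inside the sub-window (distance <= inner) every word is kept;
    in the rest of the outer window only non-stop-words are kept.
    """
    feats = []
    lo = max(0, target_idx - outer)
    hi = min(len(tokens), target_idx + outer + 1)
    for i in range(lo, hi):
        if i == target_idx:
            continue
        word = tokens[i].lower()
        if abs(i - target_idx) <= inner or word not in STOP_WORDS:
            feats.append(word)
    return feats


class NaiveBayesWSD:
    """Multinomial Naive Bayes over window features (assumed smoothing: add-alpha)."""

    def __init__(self, outer=5, inner=2, alpha=1.0):
        self.outer, self.inner, self.alpha = outer, inner, alpha
        self.sense_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, examples):
        # examples: iterable of (tokens, target_idx, sense) triples
        for tokens, idx, sense in examples:
            self.sense_counts[sense] += 1
            for w in window_features(tokens, idx, self.outer, self.inner):
                self.word_counts[sense][w] += 1
                self.vocab.add(w)
        return self

    def predict(self, tokens, idx):
        feats = window_features(tokens, idx, self.outer, self.inner)
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocab)
        best_sense, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            denom = sum(self.word_counts[sense].values()) + self.alpha * vocab_size
            score = math.log(n / total)  # log prior of the sense
            for w in feats:
                score += math.log((self.word_counts[sense][w] + self.alpha) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


train = [
    ("he sat on the bank of the river".split(), 4, "shore"),
    ("she deposited money at the bank today".split(), 5, "finance"),
]
model = NaiveBayesWSD(outer=3, inner=1).fit(train)
print(model.predict("they walked along the bank of the river".split(), 4))
```

The abstract states that both window sizes are tuned per target word. In a sketch like this one, that could be approximated by trying several `(outer, inner)` pairs for each word and keeping the pair with the best held-out accuracy; the tuning procedure actually used is not described in the abstract.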

Similar resources

Word Sense Disambiguation of Ambiguous Persian Words Using the LDA Topic Model

Word sense disambiguation is the task of identifying the correct sense of a word in a given context from a finite set of possible senses. In this paper, a model for Farsi word sense disambiguation is presented. The model uses two groups of features: first, all words and stop words around the target word, and second, topic models. We extract topics from a Farsi corpus with Latent Dirichlet ...

Simple Features for Chinese Word Sense Disambiguation

In this paper we report on our experiments on automatic Word Sense Disambiguation using a maximum entropy approach for both English and Chinese verbs. We compare the difficulty of the sense-tagging tasks in the two languages and investigate the types of contextual features that are useful for each language. Our experimental results suggest that while richer linguistic features are useful for Eng...

Combining Lexical and Syntactic Features for Supervised Word Sense Disambiguation

The success of supervised learning approaches to word sense disambiguation is largely dependent on the features used to represent the context in which an ambiguous word occurs. Previous work has reached mixed conclusions; some suggest that combinations of syntactic and lexical features will perform most effectively. However, others have shown that simple lexical features perform well on their o...

GPLSI: Word Coarse-grained Disambiguation aided by Basic Level Concepts

We present a corpus-based supervised learning system for coarse-grained sense disambiguation. In addition to the usual features for training in word sense disambiguation, our system also uses Base Level Concepts automatically obtained from WordNet. Base Level Concepts are synsets that generalize a hyponymy sub-hierarchy, and provide an extra level of abstraction as well as relevant informatio...

Potts Model on the Case Fillers for Word Sense Disambiguation

We propose a new method for word sense disambiguation for verbs. In our method, the sense-dependent selectional preference of verbs is obtained through a probabilistic model on the lexical network. The mean-field approximation is employed to compute the state of the lexical network. The outcome of the computation is used as features for discriminative classifiers. The method is evaluated on the da...

Publication date: 2004